Provenance Management for SPARQL Updates

Author

  • Argyro Avgoustaki
Abstract

During the last few years we have witnessed an explosion in the publication of data on the Web, mainly in the form of Linked Data. Scientific, corporate and even governmental data are made available for open access and used by applications, individual users and communities. Given the increasing amount and heterogeneity of this data, it is crucial to be able to track its provenance. Recording provenance helps to effectively support trustworthiness, accountability and repeatability in the Web of Data. A number of models have already been proposed to capture the provenance of query results, most of them considering RDF or relational data. In contrast, despite its importance, little research has been conducted on updates, and especially on SPARQL updates. In this thesis, we propose a new provenance model that borrows from both how- and where-provenance models and is suitable for capturing the triple-level and attribute-level provenance of SPARQL update results. To the best of our knowledge, this is the first model that deals with the provenance of SPARQL updates using algebraic provenance expressions, in the spirit of the well-established model of provenance semirings. On the algorithmic side, we introduce an algorithm that records the provenance of SPARQL update results in terms of the proposed model, and a reconstruction algorithm that uses the provenance of a quadruple to identify a SPARQL update that is provably compatible with the original one. A SPARQL update is compatible with another if they differ only in the variable names that they employ and the first update contains a genuine subset of the unions that appear in the second one. The reconstruction algorithm is a necessary complement for fully describing provenance management, as it shows the determinant role of provenance information in the persistence of SPARQL update results.


Similar resources

Dynamic Provenance for SPARQL Update

While the Semantic Web currently can exhibit provenance information by using the W3C PROV standards, there is a “missing link” in connecting PROV to storing and querying for dynamic changes to RDF graphs using SPARQL. Solving this problem would be required for such clear use-cases as the creation of version control systems for RDF. While some provenance models and annotation techniques for stor...


Scientific Workflow Provenance Metadata Management Using an RDBMS

Provenance management has become increasingly important to support scientific discovery reproducibility, result interpretation, and problem diagnosis in scientific workflow environments. This paper proposes an approach to provenance management that seamlessly integrates the interoperability, extensibility, and reasoning advantages of Semantic Web technologies with the storage and querying power...


Scientific Workflow Provenance Metadata Management Using an RDBMS-based RDF Store

Provenance management has become increasingly important to support scientific discovery reproducibility, result interpretation, and problem diagnosis in scientific workflow environments. This paper proposes an approach to provenance management that seamlessly integrates the interoperability, extensibility, and reasoning advantages of Semantic Web technologies with the storage and querying power...


Improved Dataset Coverage and Interoperability with Bio2RDF Release 2

Bio2RDF is an open source project that uses Semantic Web technologies to create and provide the largest network of Linked Data for the life sciences. Here, we present the second release of the Bio2RDF project which features updated, open-source scripts, a resource registry for IRI mapping and normalization, dataset provenance, data metrics, downloadable RDF data files and Virtuoso SPARQL endpoi...


RDFProv: A relational RDF store for querying and managing scientific workflow provenance

Article history: received 12 October 2008; received in revised form 8 March 2010; accepted 11 March 2010; available online 23 March 2010. Provenance metadata has become increasingly important to support scientific discovery reproducibility, result interpretation, and problem diagnosis in scientific workflow environments. The provenance management problem concerns the efficiency and effectiveness of...




Publication date: 2015